🧪 Test & implement executeBatchPayload fallback recovery#14
🧪 Test & implement executeBatchPayload fallback recovery#14hashexplaindata wants to merge 1 commit intomasterfrom
Conversation
- Added `localStorage` fallback to save `STATE.results` if `executeBatchPayload` fails.
- Key is dynamically generated using `telemetry_backup_${STATE.pid}`.
- Removed `console.error` and other logging for strict silent client mandate.
- Updated DOM status with `textContent` instead of `innerHTML` to prevent XSS.
- Created Playwright script (`telemetry_verification/verify_payload_fallback.py`) to verify the offline failure path, UI updates, `localStorage` state, and lack of console errors.
Co-authored-by: hashexplaindata <221828969+hashexplaindata@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the resilience and testability of telemetry data handling. It introduces a Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request successfully implements a robust fallback mechanism for telemetry data using localStorage when the primary sync fails. The changes in experiment.js are excellent, particularly the use of textContent to prevent XSS vulnerabilities and the silent failure handling for localStorage. A comprehensive Playwright test has been added to verify this new fallback logic. My review focuses on improving the robustness of this new test script by replacing fixed sleep intervals with more reliable waiting strategies, which will help prevent flaky tests.
| async def run_test(): | ||
| server = subprocess.Popen(["python3", "-m", "http.server", "8000"]) | ||
| # Wait for the server to start | ||
| time.sleep(2) |
There was a problem hiding this comment.
Using a fixed time.sleep() to wait for the server to start can lead to flaky tests. If the server takes longer to start on a slower machine or under load, the test will fail. A more robust approach is to implement a polling mechanism that repeatedly tries to connect to the server until it's responsive before proceeding with the test.
| for i in range(6): | ||
| await asyncio.sleep(0.5) # Wait for debounce and DOM updates | ||
| # Click the first card | ||
| await page.click(".bento-choice-card:first-child") |
There was a problem hiding this comment.
Using a fixed asyncio.sleep() within the loop can lead to flaky tests. A more robust approach is to use Playwright's auto-waiting capabilities with expect. By waiting for the trial counter to display the correct number for the current trial, you ensure the test only proceeds when the UI is actually ready for the next interaction. This avoids arbitrary wait times and makes the test more reliable. You'll need to add from playwright.async_api import expect at the top of the file.
| for i in range(6): | |
| await asyncio.sleep(0.5) # Wait for debounce and DOM updates | |
| # Click the first card | |
| await page.click(".bento-choice-card:first-child") | |
| for i in range(6): | |
| # Wait for the trial counter to update, indicating the trial is ready. | |
| await expect(page.locator("#trial-counter")).to_have_text(f"Diagnostic {i + 1}/6") | |
| # Click the first card | |
| await page.click(".bento-choice-card:first-child") |
| await asyncio.sleep(1) # wait for the failure catch block to execute | ||
|
|
||
| print("Verifying text content...") | ||
| # Check DOM update | ||
| sync_status_text = await page.inner_text("#sync-status") | ||
| assert "Diagnostic Complete. A network timeout occurred. You may safely close this tab." in sync_status_text, "Status text does not match expected deception message." |
There was a problem hiding this comment.
Using a fixed asyncio.sleep() to wait for an asynchronous operation can make tests flaky. It's better to use Playwright's built-in waiting mechanisms. You can replace the sleep and the manual assertion with a single expect() call that waits for the element's text to match the expected value. This makes the test more reliable and concise. You'll need to add from playwright.async_api import expect at the top of the file if you haven't already.
| await asyncio.sleep(1) # wait for the failure catch block to execute | |
| print("Verifying text content...") | |
| # Check DOM update | |
| sync_status_text = await page.inner_text("#sync-status") | |
| assert "Diagnostic Complete. A network timeout occurred. You may safely close this tab." in sync_status_text, "Status text does not match expected deception message." | |
| print("Verifying text content...") | |
| # Wait for the DOM to update with the failure message and assert its content. | |
| await expect(page.locator("#sync-status")).to_have_text("Diagnostic Complete. A network timeout occurred. You may safely close this tab.") |
There was a problem hiding this comment.
Pull request overview
Implements a real executeBatchPayload failure fallback by persisting telemetry to localStorage and rendering a safe, non-innerHTML status message, plus adds an end-to-end Playwright (Python) verification script for the offline/error path.
Changes:
- Implement
executeBatchPayloadfallback recovery by savingSTATE.resultstolocalStorageon sync failure and rendering a fixed deception message viatextContent. - Remove console logging in the client to comply with the silent-client directive.
- Add a Playwright async Python script that drives the UI, simulates offline mode, and asserts DOM text +
localStoragebackup + console silence.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| telemetry_verification/verify_payload_fallback.py | Adds Playwright E2E verification of the fallback flow (offline simulation, DOM text assertion, localStorage schema checks, console silence checks). |
| code/experiment.js | Implements localStorage fallback in executeBatchPayload, removes console logging, and replaces error rendering with textContent. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| console_messages = [] | ||
|
|
||
| # Listen to console events | ||
| page.on("console", lambda msg: console_messages.append(msg.text)) | ||
|
|
||
| print("Navigating to the experiment...") | ||
| await page.goto("http://localhost:8000/code/index.html?condition=control") | ||
|
|
||
| print("Starting the trials...") | ||
| # Screen 1 -> 2 | ||
| await page.click("#btn-consent") | ||
|
|
||
| # Screen 2 -> trial | ||
| await page.click(".btn-familiarity[data-val='2']") | ||
|
|
||
| print("Completing trials...") | ||
| # 6 Trials | ||
| for i in range(6): | ||
| await asyncio.sleep(0.5) # Wait for debounce and DOM updates | ||
| # Click the first card | ||
| await page.click(".bento-choice-card:first-child") | ||
|
|
||
| print("Providing justification...") | ||
| # Screen 9 (Justification) | ||
| await page.fill("#semantic-justification", "This is a valid semantic justification.") | ||
|
|
||
| print("Simulating offline environment...") | ||
| # Set context to offline before clicking finalize to simulate network error for Firebase | ||
| await context.set_offline(True) | ||
|
|
||
| print("Finalizing...") | ||
| # Click Finalize -> Screen 10 (executeBatchPayload) | ||
| await page.click("#btn-finalize") | ||
|
|
||
| await asyncio.sleep(1) # wait for the failure catch block to execute | ||
|
|
||
| print("Verifying text content...") | ||
| # Check DOM update | ||
| sync_status_text = await page.inner_text("#sync-status") | ||
| assert "Diagnostic Complete. A network timeout occurred. You may safely close this tab." in sync_status_text, "Status text does not match expected deception message." | ||
|
|
||
| print("Verifying localStorage backup...") | ||
| # Check localStorage | ||
| local_storage = await page.evaluate("() => Object.entries(localStorage)") | ||
|
|
||
| backup_key = None | ||
| backup_value = None | ||
| for key, val in local_storage: | ||
| if key.startswith("telemetry_backup_"): | ||
| backup_key = key | ||
| backup_value = val | ||
| break | ||
|
|
||
| assert backup_key is not None, "Backup key not found in localStorage" | ||
|
|
||
| # Verify backup content | ||
| data = json.loads(backup_value) | ||
| assert len(data) == 6, f"Expected 6 rows in Tidy Data schema, got {len(data)}" | ||
| assert "semantic_justification" in data[0], "Semantic justification not appended." | ||
| assert data[0]["semantic_justification"] == "This is a valid semantic justification.", "Semantic justification incorrect." | ||
|
|
||
| print("Verifying console silence...") | ||
| # Verify console | ||
| # Ignore the 404 for firebase-config.js as it's missing from repo | ||
| # We are verifying that experiment.js doesn't log errors | ||
| errors_or_warnings = [msg for msg in console_messages if ("error" in msg.lower() or "warn" in msg.lower()) and "404" not in msg] | ||
| assert len(errors_or_warnings) == 0, f"Found console errors/warnings: {errors_or_warnings}" |
There was a problem hiding this comment.
Console-silence validation is based on substring matching against msg.text, but Playwright console entries have a severity (msg.type) and not all errors/warnings necessarily include the words "error"/"warn" in their text. Consider recording (msg.type, msg.text) and asserting no type in {"warning","error"}, and also listen to page.on("pageerror", ...) to catch uncaught exceptions.
| // Fallback: Save to localStorage for later recovery | ||
| try { | ||
| localStorage.setItem(`telemetry_backup_${STATE.pid}`, JSON.stringify(STATE.results)); | ||
| } catch (storageError) { | ||
| // Silently fail if localStorage is unavailable | ||
| } | ||
| DOM.syncStatus.textContent = "Diagnostic Complete. A network timeout occurred. You may safely close this tab."; | ||
| } |
There was a problem hiding this comment.
In the fallback path, data is stored under telemetry_backup_${STATE.pid}, but the UI never reveals the PID (and onSyncSuccess() is not called), making it hard to recover a specific participant’s backup key. Consider also displaying the PID/recovery key when the localStorage write succeeds, or storing under a stable key plus PID inside the value.
| print(f"Test Failed: {e}") | ||
| sys.exit(1) | ||
| finally: | ||
| server.terminate() |
There was a problem hiding this comment.
server.terminate() is called but the process is never wait()ed; this can leave a zombie process (especially on CI) and keep port 8000 occupied. Consider terminate() + wait(timeout=...), and fall back to kill() if needed.
| server.terminate() | |
| if server.poll() is None: | |
| server.terminate() | |
| try: | |
| server.wait(timeout=5) | |
| except subprocess.TimeoutExpired: | |
| server.kill() | |
| server.wait() |
| assert "Diagnostic Complete. A network timeout occurred. You may safely close this tab." in sync_status_text, "Status text does not match expected deception message." | ||
|
|
||
| print("Verifying localStorage backup...") | ||
| # Check localStorage | ||
| local_storage = await page.evaluate("() => Object.entries(localStorage)") | ||
|
|
||
| backup_key = None | ||
| backup_value = None | ||
| for key, val in local_storage: | ||
| if key.startswith("telemetry_backup_"): | ||
| backup_key = key | ||
| backup_value = val | ||
| break | ||
|
|
||
| assert backup_key is not None, "Backup key not found in localStorage" | ||
|
|
||
| # Verify backup content | ||
| data = json.loads(backup_value) | ||
| assert len(data) == 6, f"Expected 6 rows in Tidy Data schema, got {len(data)}" | ||
| assert "semantic_justification" in data[0], "Semantic justification not appended." | ||
| assert data[0]["semantic_justification"] == "This is a valid semantic justification.", "Semantic justification incorrect." | ||
|
|
||
| print("Verifying console silence...") | ||
| # Verify console | ||
| # Ignore the 404 for firebase-config.js as it's missing from repo | ||
| # We are verifying that experiment.js doesn't log errors | ||
| errors_or_warnings = [msg for msg in console_messages if ("error" in msg.lower() or "warn" in msg.lower()) and "404" not in msg] | ||
| assert len(errors_or_warnings) == 0, f"Found console errors/warnings: {errors_or_warnings}" |
There was a problem hiding this comment.
This script uses bare assert statements for test validation; running Python with optimizations (python -O) disables asserts and would let failures pass silently. Prefer explicit checks that raise/exit (or use a test framework like pytest) so the validations always execute.
| async def run_test(): | ||
| server = subprocess.Popen(["python3", "-m", "http.server", "8000"]) | ||
| # Wait for the server to start | ||
| time.sleep(2) |
There was a problem hiding this comment.
time.sleep(2) is called inside an async def and blocks the event loop, which can make the Playwright run flaky/slow. Prefer await asyncio.sleep(...) or (better) actively poll for the server port to accept connections before proceeding.
| time.sleep(2) | |
| await asyncio.sleep(2) |
| server = subprocess.Popen(["python3", "-m", "http.server", "8000"]) | ||
| # Wait for the server to start | ||
| time.sleep(2) | ||
|
|
||
| try: | ||
| async with async_playwright() as p: | ||
| browser = await p.chromium.launch(headless=True) | ||
| context = await browser.new_context() | ||
| page = await context.new_page() | ||
|
|
||
| console_messages = [] | ||
|
|
||
| # Listen to console events | ||
| page.on("console", lambda msg: console_messages.append(msg.text)) | ||
|
|
||
| print("Navigating to the experiment...") | ||
| await page.goto("http://localhost:8000/code/index.html?condition=control") |
There was a problem hiding this comment.
The http.server is started without setting cwd, so if this script is executed from a directory other than the repo root (e.g. telemetry_verification/), GET /code/index.html will 404 and the test will fail. Consider resolving the repo root via __file__ and starting the server with cwd=... (or serve an explicit directory).
| browser = await p.chromium.launch(headless=True) | ||
| context = await browser.new_context() | ||
| page = await context.new_page() | ||
|
|
||
| console_messages = [] | ||
|
|
||
| # Listen to console events | ||
| page.on("console", lambda msg: console_messages.append(msg.text)) | ||
|
|
||
| print("Navigating to the experiment...") | ||
| await page.goto("http://localhost:8000/code/index.html?condition=control") | ||
|
|
||
| print("Starting the trials...") | ||
| # Screen 1 -> 2 | ||
| await page.click("#btn-consent") | ||
|
|
||
| # Screen 2 -> trial | ||
| await page.click(".btn-familiarity[data-val='2']") | ||
|
|
||
| print("Completing trials...") | ||
| # 6 Trials | ||
| for i in range(6): | ||
| await asyncio.sleep(0.5) # Wait for debounce and DOM updates | ||
| # Click the first card | ||
| await page.click(".bento-choice-card:first-child") | ||
|
|
||
| print("Providing justification...") | ||
| # Screen 9 (Justification) | ||
| await page.fill("#semantic-justification", "This is a valid semantic justification.") | ||
|
|
||
| print("Simulating offline environment...") | ||
| # Set context to offline before clicking finalize to simulate network error for Firebase | ||
| await context.set_offline(True) | ||
|
|
||
| print("Finalizing...") | ||
| # Click Finalize -> Screen 10 (executeBatchPayload) | ||
| await page.click("#btn-finalize") | ||
|
|
||
| await asyncio.sleep(1) # wait for the failure catch block to execute | ||
|
|
||
| print("Verifying text content...") | ||
| # Check DOM update | ||
| sync_status_text = await page.inner_text("#sync-status") | ||
| assert "Diagnostic Complete. A network timeout occurred. You may safely close this tab." in sync_status_text, "Status text does not match expected deception message." | ||
|
|
||
| print("Verifying localStorage backup...") | ||
| # Check localStorage | ||
| local_storage = await page.evaluate("() => Object.entries(localStorage)") | ||
|
|
||
| backup_key = None | ||
| backup_value = None | ||
| for key, val in local_storage: | ||
| if key.startswith("telemetry_backup_"): | ||
| backup_key = key | ||
| backup_value = val | ||
| break | ||
|
|
||
| assert backup_key is not None, "Backup key not found in localStorage" | ||
|
|
||
| # Verify backup content | ||
| data = json.loads(backup_value) | ||
| assert len(data) == 6, f"Expected 6 rows in Tidy Data schema, got {len(data)}" | ||
| assert "semantic_justification" in data[0], "Semantic justification not appended." | ||
| assert data[0]["semantic_justification"] == "This is a valid semantic justification.", "Semantic justification incorrect." | ||
|
|
||
| print("Verifying console silence...") | ||
| # Verify console | ||
| # Ignore the 404 for firebase-config.js as it's missing from repo | ||
| # We are verifying that experiment.js doesn't log errors | ||
| errors_or_warnings = [msg for msg in console_messages if ("error" in msg.lower() or "warn" in msg.lower()) and "404" not in msg] | ||
| assert len(errors_or_warnings) == 0, f"Found console errors/warnings: {errors_or_warnings}" | ||
|
|
||
| print("All assertions passed. Fallback logic verified successfully.") |
There was a problem hiding this comment.
browser/context are never explicitly closed. If an assertion fails before process teardown, Playwright can leave Chromium processes running and make subsequent runs unstable. Use async with p.chromium.launch(...) as browser: / async with browser.new_context() as context: or close them in finally.
| browser = await p.chromium.launch(headless=True) | |
| context = await browser.new_context() | |
| page = await context.new_page() | |
| console_messages = [] | |
| # Listen to console events | |
| page.on("console", lambda msg: console_messages.append(msg.text)) | |
| print("Navigating to the experiment...") | |
| await page.goto("http://localhost:8000/code/index.html?condition=control") | |
| print("Starting the trials...") | |
| # Screen 1 -> 2 | |
| await page.click("#btn-consent") | |
| # Screen 2 -> trial | |
| await page.click(".btn-familiarity[data-val='2']") | |
| print("Completing trials...") | |
| # 6 Trials | |
| for i in range(6): | |
| await asyncio.sleep(0.5) # Wait for debounce and DOM updates | |
| # Click the first card | |
| await page.click(".bento-choice-card:first-child") | |
| print("Providing justification...") | |
| # Screen 9 (Justification) | |
| await page.fill("#semantic-justification", "This is a valid semantic justification.") | |
| print("Simulating offline environment...") | |
| # Set context to offline before clicking finalize to simulate network error for Firebase | |
| await context.set_offline(True) | |
| print("Finalizing...") | |
| # Click Finalize -> Screen 10 (executeBatchPayload) | |
| await page.click("#btn-finalize") | |
| await asyncio.sleep(1) # wait for the failure catch block to execute | |
| print("Verifying text content...") | |
| # Check DOM update | |
| sync_status_text = await page.inner_text("#sync-status") | |
| assert "Diagnostic Complete. A network timeout occurred. You may safely close this tab." in sync_status_text, "Status text does not match expected deception message." | |
| print("Verifying localStorage backup...") | |
| # Check localStorage | |
| local_storage = await page.evaluate("() => Object.entries(localStorage)") | |
| backup_key = None | |
| backup_value = None | |
| for key, val in local_storage: | |
| if key.startswith("telemetry_backup_"): | |
| backup_key = key | |
| backup_value = val | |
| break | |
| assert backup_key is not None, "Backup key not found in localStorage" | |
| # Verify backup content | |
| data = json.loads(backup_value) | |
| assert len(data) == 6, f"Expected 6 rows in Tidy Data schema, got {len(data)}" | |
| assert "semantic_justification" in data[0], "Semantic justification not appended." | |
| assert data[0]["semantic_justification"] == "This is a valid semantic justification.", "Semantic justification incorrect." | |
| print("Verifying console silence...") | |
| # Verify console | |
| # Ignore the 404 for firebase-config.js as it's missing from repo | |
| # We are verifying that experiment.js doesn't log errors | |
| errors_or_warnings = [msg for msg in console_messages if ("error" in msg.lower() or "warn" in msg.lower()) and "404" not in msg] | |
| assert len(errors_or_warnings) == 0, f"Found console errors/warnings: {errors_or_warnings}" | |
| print("All assertions passed. Fallback logic verified successfully.") | |
| async with p.chromium.launch(headless=True) as browser: | |
| async with browser.new_context() as context: | |
| page = await context.new_page() | |
| console_messages = [] | |
| # Listen to console events | |
| page.on("console", lambda msg: console_messages.append(msg.text)) | |
| print("Navigating to the experiment...") | |
| await page.goto("http://localhost:8000/code/index.html?condition=control") | |
| print("Starting the trials...") | |
| # Screen 1 -> 2 | |
| await page.click("#btn-consent") | |
| # Screen 2 -> trial | |
| await page.click(".btn-familiarity[data-val='2']") | |
| print("Completing trials...") | |
| # 6 Trials | |
| for i in range(6): | |
| await asyncio.sleep(0.5) # Wait for debounce and DOM updates | |
| # Click the first card | |
| await page.click(".bento-choice-card:first-child") | |
| print("Providing justification...") | |
| # Screen 9 (Justification) | |
| await page.fill("#semantic-justification", "This is a valid semantic justification.") | |
| print("Simulating offline environment...") | |
| # Set context to offline before clicking finalize to simulate network error for Firebase | |
| await context.set_offline(True) | |
| print("Finalizing...") | |
| # Click Finalize -> Screen 10 (executeBatchPayload) | |
| await page.click("#btn-finalize") | |
| await asyncio.sleep(1) # wait for the failure catch block to execute | |
| print("Verifying text content...") | |
| # Check DOM update | |
| sync_status_text = await page.inner_text("#sync-status") | |
| assert "Diagnostic Complete. A network timeout occurred. You may safely close this tab." in sync_status_text, "Status text does not match expected deception message." | |
| print("Verifying localStorage backup...") | |
| # Check localStorage | |
| local_storage = await page.evaluate("() => Object.entries(localStorage)") | |
| backup_key = None | |
| backup_value = None | |
| for key, val in local_storage: | |
| if key.startswith("telemetry_backup_"): | |
| backup_key = key | |
| backup_value = val | |
| break | |
| assert backup_key is not None, "Backup key not found in localStorage" | |
| # Verify backup content | |
| data = json.loads(backup_value) | |
| assert len(data) == 6, f"Expected 6 rows in Tidy Data schema, got {len(data)}" | |
| assert "semantic_justification" in data[0], "Semantic justification not appended." | |
| assert data[0]["semantic_justification"] == "This is a valid semantic justification.", "Semantic justification incorrect." | |
| print("Verifying console silence...") | |
| # Verify console | |
| # Ignore the 404 for firebase-config.js as it's missing from repo | |
| # We are verifying that experiment.js doesn't log errors | |
| errors_or_warnings = [msg for msg in console_messages if ("error" in msg.lower() or "warn" in msg.lower()) and "404" not in msg] | |
| assert len(errors_or_warnings) == 0, f"Found console errors/warnings: {errors_or_warnings}" | |
| print("All assertions passed. Fallback logic verified successfully.") |
|
Closing: Fixes have already been incorporated directly into the master branch. |
Understood. Acknowledging that this work is now obsolete and stopping work on this task. |
🎯 What: The
executeBatchPayloaderror handling was missing robust testing and thelocalStoragefallback mechanism was only a comment placeholder. Furthermore, error messages were logged to the console, violating the silent client directive, and usinginnerHTMLto render error objects posed an XSS risk.📊 Coverage:
executeBatchPayload.localStorageunder the exact schema needed.✨ Result: A robust, tested, and secure fallback recovery mechanism for telemetry data, adhering perfectly to the architectural directives requiring vanilla JS client code tested via external DOM manipulation and network interception. Test coverage for the critical error path is now established.
PR created automatically by Jules for task 13120139074774822868 started by @hashexplaindata